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Abstract 

We provide a simple and efficient algorithm for computing the Euclidean projection of a point onto 
the capped simplex, formally defined as 

min — 1| x — y 11 - s.t. x T l = s, 0 < x < 1, 

x6R® 2 

together with an elementary proof. Both the MATLAB and C++ implementations of the proposed algo¬ 
rithm can be downloaded at https://eng.ucmerced.edu/people/wwang5. 


1 The Problem 

In this report, we consider the following optimization problem 


min 

\ ||x - y|| 2 

(la) 

xGl D 

2 

s.t. 

x T l = s 

(lb) 


0 < x < 1, 

(lc) 


where s £ [0,i2] is a parameter of the problem, 0 and 1 are vectors of 0’s and l’s respectively, and < 
means elementwise comparison. The feasible set of this problem is the intersection of the unit cube and a 
hyperplane with normal 1. Alternatively, the feasible set is the simplex {x : x > 0 , x T l = s} with an 
additional capping constraints x < 1, so we call it the capped simplex. Problem [T] is a quadratic program and 
the objective function is strictly convex, so there is a unique solution which we denote by x = [xi ,..., io] T 
with a slight abuse of notation. 


Remark 1.1 . This problem is a slight generalization of the projection onto the probability simplex (see 


iDuchi et al. . 2008 : Wang and Carreira-Perninan , 2013 and the references therein), which is a special case of 


© by setting s = 1 and can be solved exa ctly with Q( D log D) time complexi ty. An elementary proof of 
the corresponding algorithm can be found in Wang and Carreira-Perpinanl ( 20131) and the cost mainly comes 
from sorting the dimensions of y. Our solution to © in this report is derived using a similar idea. 


Remark 1.2. The constraint 0 < x < 1 in © can be generalized to 0 < x < fl where t is any positive 
number. We only need to solve the following instance of ©: 


1 o -p 

min - ||x — y/t|| , s.t. x T l = s/t, 0 < x < 1, 

x 2 


and then scale its solution x by t to obtain the solution of the original problem. 


2 The algorithm 

We provide an 0(D 2 ) algorithm for solving dTJ in Algorithm [TJ 
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Algorithm 1 Euclidean projection of a vector onto the section of cube. 

Input: y G is sorted in ascending order: yi < yi < ■ ■ ■ < ud- 
1 : Set yo = — oo and yo +1 = oo, compute partial sums To = 0, and Tk = ]T j=i y ki k = 1,..., D. 
2 : for a = 0,1,..., D do 

3 : if (s == D - a) && ( y a+ i - y a > 1) then 

4 : Set b=a. 

5: break 

6 : end if 

7: for b = a + 1,..., D do 

8 : Compute 7 = s+b-n+'l a -'l b ' 

9: if (y a + 7 < o) && (y a +1 + 7 > 0) && (y b + 7 < 1) && ( y b+ i > 1) then 

10 : break 

ll: end if 

12 : end for 

13: end for 

Output: x = [0,..., 0, y a +1 + 7 ,..., + 7 ,1,..., 1]. 


3 The proof 


As m entioned earlier, ([T]) has a unique solution which is characterized by its KKT system (jNocedal and Wright . 
2006i h The Lagrangian function of the problem is 


T(x,a,/ 3 , 7 ) = | ||x — y || 2 - a T x -/3 T (1 - x) - 7 (l T x- s) 

where a = [au,. .., o_d] t and f3 = \/3i,..., /3d] t are the Lagrange multipliers for the inequality constraints 
x > 0 and 1 — x > 0 respectively, and 7 is the Lagrange multiplier for the equality constraint. At the 
optimal solution x the following KKT conditions hold: 


Oi-i + ft — 7 = 0, 

i = 1, 

..,D 

( 2 a) 

Xi > 0 , 

i = 1, 

..,D 

( 2 b) 

Xi < 1, 

i = 1, 

..,D 

( 2 c) 

P 

IV 

0 

i = 1, 

..,D 

( 2 d) 

ft > 0 , 

i = 1, 

..,D 

( 2 e) 

Xi = s > 

z — ' 1=1 



( 2 f) 


i = 1, 

..,D 

( 2 g) 

ft(l - Xi) = 0, 

i = 1, 

..,D, 

( 2 h) 


where d2g| and ( |2gl ) are complementary slackness (CS) conditions. 

Without loss of generality, we assume the components of the optimal solution x are in ascending order: 


0 X\ ■ x a x a _|_r ^ ^ x b ... xk) 1, 


(3) 


where a is the number of 0’s in the solution while D — b is the number of l’s in the solution. The valid 
ranges for (a, 6 ) are 0 < a < D and a < b < D. The KKT conditions can be simplified for different set of 
dimensions of the solution: 

(i) For i = 1,. .., a, the CS condition (|2h|) indicates ft = 0, and thus 

0 = Xi = yi + oti + 7 > yt + 7 , (4) 

where the last inequality uses the fact that cq > 0 . 
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(ii) For j = b + 1,..., D, the CS condition Pg| ) indicates atj = 0, and thus 

1 = Xj = yj - Pj + 7 < Vj + 7 , ( 5 ) 

where the last inequality uses the fact that j3j >0. 

(iii) For k = a + 1,..., b, the CS conditions indicate ak = Pk = 0, and thus 

0 < x k = yk + 7 < I- (6) 


It is then clear that for any l<i<a,b+l<j<D and a + 1 < k < 6 , we have 

Vi < -7 < Vk < 1 - 7 < Vj- ( 7 ) 

In other words, if the dimensions of y are sorted in ascending order, the corresponding dimensions of the 
solution x is also in ascending order. Therefore, the first step of our algorithm is to sort dimensions of y 
into ascending order. And all that is left is to find (a,b), the partition of x into the three segments. The 
only KKT condition we have not used so far is the sum constraint ©i, which now reduces to 

D b 

^2,Xi = a-0 + ^2 {y k + 1 ) + {n - b) -1 = s. (8) 

i =1 k=a-\- 1 

This means that if we know (a, b) for the solution x, we must have 

s + b-n - Efc=a +1 Vk fn . 

b — a 

Since there are only ( D + 1 ')( D + 2 > possible combinations for the indices (a, b), we could test each combination 
and compute the hypothesized 7 value using ©• With the hypothesized 7, the tests we need for (a, 6,7) to 
produce the optimal x are the following: 


2 /q + 7<0, y a + 1+7>0, y b + 7 < 1, y b + 1+7>!- 


( 10 ) 


It is easy to verify that the (a,b, 7) combination that passes the above test leads to (x, a,/3, 7) that satisfy 
all the KKT conditions, where {ai}“ =1 and {/3j}^L 6+1 can be retrieved using © and © respectively. 

Remark 3.1. A special case is when a = b, for which © is ill-defined. In this case, the optimal solution 
consists of a 0’s and n — a l’s. For this to happen, we must have s = D — a and then (flUl) reduces to 
2/o+i + 7 > 1 and y a + 7 < 0, and so 2/0+1 - Va > 1- 

Remark 3.2. The reason why the computational complexity of our algorithm is (D(D 2 ) for © as opposed 
to 0{D log D) for projection onto probability simplex is due to the constraint x < 1. This constraint is 
automatically satisfied for the projection onto probability simplex problem where s = 1, in which case we 
only need to figure out a —the number of zero dimensions in the solution. 


Remark 3.3. In view of the previous remark, another way of solving © is to alternatively project the estimate 
onto the (scaled) probability simplex {x : x > 0, x T l = s} and the set {x : x < 1} for which the projection 
is trivial to compute (we simply threshold the dimensions that are gr eater than 1 to 1) . And yet another 
approach is to apply the Alternating Direction Method of Multipliers (iBovd et all I201D . which introduces 
another copy of the variables x and alternately optimize each copy with simple steps while encouraging the 
two copies to agree. We note that these approaches are iterative and the number of iterations depends on 
the desired accuracy. On the contrary, our method finds the exact solution within a fixed number of steps. 


4 Experiment 


In this section, we demonstrate the effectiveness of our proposed method in Algorithm [1] for solving ©. 
We compare our method with another two solvers. The first one is the CVX package (IGrant and Bovd . 
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Table 1: Running time (in seconds) of different solvers for various sizes of y. 


Methods 

D = 50 

100 

500 

1000 

2000 

5000 

10000 

20000 

100000 

lsqlin 

0.049 

0.11 

8.18 

59.10 

360.20 

5078.00 

- 

- 

- 

CVX 

0.73 

1.01 

5.03 

7.41 

14.55 

40.94 

- 

- 

- 

Ours - MATLAB 

0.0005 

0.002 

0.023 

0.083 

0.44 

2.02 

11.30 

27.09 

870.39 

Ours - C++ 

0.00002 

0.00003 

0.0003 

0.0009 

0.005 

0.021 

0.11 

0.27 

8.78 


201211 . a general convex program solver which transforms the problem into a semi-definite program and then 
applies interior point method. And the second one is the MATLAB command lsqlin for solving constrained 
linear least squares. We implement Algorithm Q] in both MATLAB and C++ and conduct all experiments in 
MATLAB (the C++ code is compiled within MATLAB and the mex-file is used). 

All experiments are run on a PC with an Intel Core 2 Quad CPU Q9550 of frequency 2.83GH and 8GB 
main memory, under Windows 7 and MATLAB version 8.0. We generate y and s using the MATLAB 
commands 


y=rand(D,1)-0.5; 
s=round(rand*D); 

and record the running time of each method using tic and toe. We choose the dimension D of y from 
{50,100, 500,1000, 2000, 5000,10000, 20000,100000}. For each choice of D, the experiments are repeated 20 
times and the average running times are reported for comparison. 

The results are shown in Table [l] It can be seen that our proposed method is always faster than CVX and 
lsqlin for different dimensions D of y. And as D increases, the improvement of our method over the others 
becomes more significant. If D is relatively large, the compared solvers may run out of memory (denoted as 
’ in Table [[]). These results confirm that it is beneficiary to explore the special structures of our problem 
rather than using general convex program solvers. Furthermore, the C++ version of our method is by two 
orders of magnitude faster than the MATLAB version; this improvement is important for projections in very 
high dimensions. 
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